Fine-Grained Activity Recognition for Assembly Videos
Authors
Abstract
In this letter we address the task of recognizing assembly actions as a structure (e.g., a piece of furniture or a toy block tower) is built up from a set of primitive objects. Recognizing the full range of assembly actions requires perception at a level of spatial detail that has not been attempted in the action recognition literature to date. We extend the fine-grained activity recognition setting to address the task of assembly action recognition in its full generality by unifying assembly actions and kinematic structures within a single framework. We use this framework to develop a general method for recognizing assembly actions from observation sequences, along with observation features that take advantage of a spatial assembly's special structure. Finally, we evaluate our method empirically on two application-driven data sources: 1) an IKEA furniture-assembly dataset, and 2) a block-building dataset. On the first, our system recognizes assembly actions with an average framewise accuracy of 70% and an average normalized edit distance of 10%. On the second, which requires fine-grained geometric reasoning to distinguish between assemblies, our system attains an average normalized edit distance of 23%, a relative improvement of 69% over prior work.
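The abstract reports two metrics, framewise accuracy and normalized edit distance, without defining them on this page. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. It assumes that framewise accuracy is the fraction of frames whose predicted label matches the reference, and that the edit distance is a Levenshtein distance between segment-level label sequences, normalized by the longer sequence. All function names are hypothetical.

```python
from itertools import groupby


def framewise_accuracy(predicted, reference):
    """Fraction of frames whose predicted label equals the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def collapse_segments(frames):
    """Merge runs of identical consecutive labels into one segment label."""
    return [label for label, _ in groupby(frames)]


def edit_distance(a, b):
    """Levenshtein distance between two label sequences (row-by-row DP)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(predicted, reference):
    """Segment-level edit distance divided by the longer segment sequence.

    Assumption: this normalization is one common convention; the paper's
    exact definition may differ.
    """
    p = collapse_segments(predicted)
    r = collapse_segments(reference)
    longest = max(len(p), len(r))
    if longest == 0:
        return 0.0
    return edit_distance(p, r) / longest
```

Under these assumptions, higher is better for framewise accuracy and lower is better for normalized edit distance, which is consistent with the abstract describing the drop to 23% as an improvement.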
Similar resources
Fine-Grained Entity Recognition
Entity Recognition (ER) is a key component of relation extraction systems and many other natural-language processing applications. Unfortunately, most ER systems are restricted to producing labels from a small set of entity classes, e.g., person, organization, location or miscellaneous. In order to intelligently understand text and extract a wide range of information, it is useful to more prec...
TennisVid2Text: Fine-grained Descriptions for Domain Specific Videos
Automatically describing videos has always been fascinating. In this work, we attempt to describe videos from a specific domain: broadcast videos of lawn tennis matches. Given a video shot from a tennis match, we intend to generate a textual commentary similar to what a human expert would write on a sports website. Unlike many recent works that focus on generating short captions, we are interest...
Hand Detection and Tracking in Videos for Fine-Grained Action Recognition
In this paper, we develop an effective method of detecting and tracking hands in uncontrolled videos based on multiple cues including hand shape, skin color, upper body position and flow information. We apply our hand detection results to perform fine-grained human action recognition. We demonstrate that motion features extracted from hand areas can help classify actions even when they look fam...
Fine-grained Recognition Datasets for Biodiversity Analysis
In the following paper, we present and discuss challenging applications for fine-grained visual classification (FGVC): biodiversity and species analysis. We not only give details about two challenging new datasets suitable for computer vision research with up to 675 highly similar classes, but also present first results with localized features using convolutional neural networks (CNN). We concl...
Fine-Grained Activity Recognition with Holistic and Pose Based Features
Holistic methods based on dense trajectories [29, 30] are currently the de facto standard for recognition of human activities in video. Whether holistic representations will sustain or will be superseded by higher level video encoding in terms of body pose and motion is the subject of an ongoing debate [12]. In this paper we aim to clarify the underlying factors responsible for good performance...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2021
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2021.3064149